Normalization of Relative and Incomplete Temporal Expressions in Clinical Narratives
Abstract
OBJECTIVE To improve the normalization of relative and incomplete temporal expressions (RI-TIMEXes) in clinical narratives. METHODS We analyzed the RI-TIMEXes in temporally annotated corpora and proposed two hypotheses regarding the normalization of RI-TIMEXes in the clinical narrative domain: the anchor point hypothesis and the anchor relation hypothesis. We annotated the RI-TIMEXes in three corpora to study the characteristics of RI-TIMEXes in different domains. This informed the design of our RI-TIMEX normalization system for the clinical domain, which consists of an anchor point classifier, an anchor relation classifier, and a rule-based RI-TIMEX text span parser. We experimented with different feature sets and performed an error analysis for each system component. RESULTS The annotation confirmed the hypotheses that we can simplify the RI-TIMEX normalization task using two multi-label classifiers. Our system achieves anchor point classification, anchor relation classification, and rule-based parsing accuracies of 74.68%, 87.71%, and 57.2% (82.09% under relaxed matching criteria), respectively, on the held-out test set of the 2012 i2b2 temporal relation challenge. DISCUSSION Experiments with feature sets reveal some interesting findings; for example, the verbal tense feature informs anchor relation classification in clinical narratives less than the tokens near the RI-TIMEX do. Error analysis showed that underrepresented anchor point and anchor relation classes are difficult to detect. CONCLUSIONS We formulate the RI-TIMEX normalization problem as a pair of multi-label classification problems. Considering only RI-TIMEX extraction and normalization, the system achieves a statistically significant improvement over the RI-TIMEX results of the best systems in the 2012 i2b2 challenge.
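The three components named in the abstract (anchor point classifier, anchor relation classifier, rule-based span parser) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The anchor point labels ("admission", "discharge"), the relation labels ("before", "after", "same"), the unit table, and the function names are all assumptions made for illustration, and the two classifier outputs are passed in as plain strings rather than predicted.

```python
import re
from datetime import date, timedelta

# Hypothetical rule tables; the abstract does not list the parser's actual rules.
UNIT_DAYS = {"day": 1, "week": 7}
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3, "four": 4}

# Matches a quantity (digits or a word) followed by a unit, e.g. "3 weeks".
SPAN_PATTERN = re.compile(r"(\d+|[a-z]+)\s+(day|week)s?", re.IGNORECASE)


def parse_offset(span: str) -> timedelta:
    """Rule-based span parsing: turn text such as 'two days' into a timedelta."""
    match = SPAN_PATTERN.search(span)
    if match is None:
        raise ValueError(f"no quantity/unit found in {span!r}")
    quantity_text = match.group(1).lower()
    unit = match.group(2).lower()
    if quantity_text.isdigit():
        quantity = int(quantity_text)
    elif quantity_text in NUMBER_WORDS:
        quantity = NUMBER_WORDS[quantity_text]
    else:
        raise ValueError(f"unknown quantity {quantity_text!r} in {span!r}")
    return timedelta(days=quantity * UNIT_DAYS[unit])


def normalize(span: str, anchors: dict, anchor_point: str, relation: str) -> date:
    """Combine an anchor point label, an anchor relation label, and the parsed span.

    `anchor_point` and `relation` stand in for the outputs of the two classifiers.
    """
    anchor_date = anchors[anchor_point]
    if relation == "same":
        return anchor_date
    offset = parse_offset(span)
    if relation == "after":
        return anchor_date + offset
    if relation == "before":
        return anchor_date - offset
    raise ValueError(f"unknown anchor relation {relation!r}")


# Example anchors for one invented discharge summary.
anchors = {"admission": date(2012, 3, 1), "discharge": date(2012, 3, 10)}
print(normalize("two days later", anchors, "admission", "after").isoformat())
print(normalize("3 weeks prior to admission", anchors, "admission", "before").isoformat())
```

Splitting the task this way means each decision (which date to anchor to, which direction to move, how far) can be evaluated separately, which matches the per-component accuracies reported above.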
Related resources
MedTime: A temporal information extraction system for clinical narratives
Temporal information extraction from clinical narratives is of critical importance to many clinical applications. We participated in the EVENT/TIMEX3 track of the 2012 i2b2 clinical temporal relations challenge, and presented our temporal information extraction system, MedTime. MedTime comprises a cascade of rule-based and machine-learning pattern recognition procedures. It achieved a micro-ave...
Recognizing Temporal Information in Korean Clinical Narratives through Text Normalization
OBJECTIVES Acquiring temporal information is important because knowledge in clinical narratives is time-sensitive. In this paper, we describe an approach that can be used to extract the temporal information found in Korean clinical narrative texts. METHODS We developed a two-stage system, which employs an exhaustive text analysis phase and a temporal expression recognition phase. Since our ta...
WikiWarsDE: A German Corpus of Narratives Annotated with Temporal Expressions
Temporal information plays an important role in many natural language processing and understanding tasks. Therefore, the extraction and normalization of temporal expressions from documents are crucial preprocessing steps in these research areas, and several temporal taggers have been developed in the past. The quality of such temporal taggers is usually evaluated using annotated corpora as gold...
Evaluating temporal relations in clinical text: 2012 i2b2 Challenge
BACKGROUND The Sixth Informatics for Integrating Biology and the Bedside (i2b2) Natural Language Processing Challenge for Clinical Records focused on the temporal relations in clinical narratives. The organizers provided the research community with a corpus of discharge summaries annotated with temporal information, to be used for the development and evaluation of temporal reasoning systems. 18...
Automatic Extraction of Time Expressions Accross Domains in French Narratives
The prevalence of temporal references across all types of natural language utterances makes temporal analysis a key issue in Natural Language Processing. This work addresses three research questions: 1/ is temporal expression recognition specific to a particular domain? 2/ if so, can we characterize domain specificity? and 3/ how can subdomain specificity be integrated in a single tool for unified ...
Journal title:
- Journal of the American Medical Informatics Association : JAMIA
Volume 22, Issue 5
Pages -
Publication date 2015